Finite size effects in on-line learning of multi-layer neural networks
Abstract
We complement the recent progress in thermodynamic limit analyses of mean on-line gradient descent learning dynamics in multi-layer networks by calculating the fluctuations possessed by finite dimensional systems. Fluctuations from the mean dynamics are largest at the onset of specialisation, as student hidden unit weight vectors begin to imitate specific teacher vectors, and increase with the degree of symmetry of the initial conditions. Including a term to stimulate asymmetry in the learning process typically significantly decreases finite size effects and training time.

Recent advances in the theory of on-line learning have yielded insights into the training dynamics of multi-layer neural networks. In on-line learning, the weights parametrizing the student network are updated according to the error on a single example from a stream of examples, $\{\xi^\mu, \zeta(\xi^\mu)\}$, generated by a teacher network $\zeta(\cdot)$ [1]. The analysis of the resulting weight dynamics has previously been treated by assuming an infinite input dimension (thermodynamic limit), such that a mean dynamics analysis is exact [2]. We present a more realistic treatment by calculating corrections to the mean dynamics induced by finite dimensional inputs [3].

We assume that the teacher network the student attempts to learn is a soft committee machine [1] of $N$ inputs and $M$ hidden units. This is a one hidden layer network with the weights connecting each hidden unit to the output unit set to +1, and with each hidden unit $n$ connected to all input units by a weight vector $B_n$ ($n = 1, \dots, M$). Explicitly, for the $N$ dimensional training input vector $\xi^\mu$, the output of the teacher is given by

$$\zeta^\mu = \sum_{n=1}^{M} g(B_n \cdot \xi^\mu), \qquad (1)$$

where $g(x)$ is the activation function of the hidden units, and we take $g(x) = \mathrm{erf}(x/\sqrt{2})$. The teacher generates a stream of training examples $(\xi^\mu, \zeta^\mu)$, with input components drawn from a normal distribution of zero mean and unit variance.
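As a rough illustration of the teacher described above, the following Python sketch evaluates a soft committee machine and draws a short stream of training examples. It is a minimal sketch under stated assumptions, not the authors' code: the names `g`, `teacher_output` and `example_stream`, the sizes `N = 10` and `M = 3`, and the Gaussian draw used for the teacher weight vectors are all hypothetical choices made for the example. Only the erf activation, the unit hidden-to-output weights, and the zero-mean, unit-variance inputs come from the text.

```python
import math
import random


def g(x):
    """Hidden-unit activation g(x) = erf(x / sqrt(2))."""
    return math.erf(x / math.sqrt(2.0))


def teacher_output(B, xi):
    """Soft committee machine output: sum over hidden units of g(B_n . xi).

    B is a list of M weight vectors (each of length N); the hidden-to-output
    weights are fixed at +1, so the output is a plain sum.
    """
    return sum(g(sum(b * x for b, x in zip(B_n, xi))) for B_n in B)


def example_stream(B, n_examples, rng):
    """Yield (input, teacher output) pairs with i.i.d. N(0, 1) input components."""
    N = len(B[0])
    for _ in range(n_examples):
        xi = [rng.gauss(0.0, 1.0) for _ in range(N)]
        yield xi, teacher_output(B, xi)


# Assumed sizes and an assumed Gaussian draw for the teacher weights
# (the text does not specify how the teacher vectors are chosen).
rng = random.Random(0)
N, M = 10, 3
B = [[rng.gauss(0.0, 1.0) for _ in range(N)] for _ in range(M)]

examples = list(example_stream(B, 5, rng))
```

Because erf is bounded in magnitude by 1, each teacher output lies between -M and +M, and a student trained on-line would see each pair from `example_stream` exactly once.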
Publication date: 2007